Feed model_id and variant_label to recipe#309
Conversation
Better comment Co-authored-by: Alistair Sellar <16133375+alistairsellar@users.noreply.github.com>
Remove comment Co-authored-by: Alistair Sellar <16133375+alistairsellar@users.noreply.github.com>
Better comment Co-authored-by: Alistair Sellar <16133375+alistairsellar@users.noreply.github.com>
…and variant_label
|
@alistairsellar could you please provide feedback on the errors encountered. These are not related to the implementation but not availability of data (I think, but not an expert on this!) When running cylc, run_recipe_radiation_budget failed We get this error from ESMValtool.
|
Hrmmm, it is indeed looking there. However it shouldn't be looking there, since that's our mirror of CMIP data and we now want CMEW to be feeding the locally standardised data into ESMValTool, not CMIP data. So ESMValTool should be looking in the cylc-run dir for the standardised data. |
model_id and variant_label to recipe
alistairsellar
left a comment
There was a problem hiding this comment.
Thanks @mo-nikosbaltas, looks good. My requested changes relate to comments and docstrings only.
Co-authored-by: Alistair Sellar <16133375+alistairsellar@users.noreply.github.com>
Co-authored-by: Alistair Sellar <16133375+alistairsellar@users.noreply.github.com>
Co-authored-by: Alistair Sellar <16133375+alistairsellar@users.noreply.github.com>
…b.com:MetOffice/CMEW into 287-feed-model_id-and-variant_label-to-recipe
Co-authored-by: Alistair Sellar <16133375+alistairsellar@users.noreply.github.com>
Co-authored-by: Alistair Sellar <16133375+alistairsellar@users.noreply.github.com>
Co-authored-by: Alistair Sellar <16133375+alistairsellar@users.noreply.github.com>
Co-authored-by: Alistair Sellar <16133375+alistairsellar@users.noreply.github.com>
Co-authored-by: Alistair Sellar <16133375+alistairsellar@users.noreply.github.com>
alistairsellar
left a comment
There was a problem hiding this comment.
Apologies, my suggestions included trailing spaces, which has broken the GitHub checks. These suggestions remove some of them - hopefully all...
Co-authored-by: Alistair Sellar <16133375+alistairsellar@users.noreply.github.com>
Co-authored-by: Alistair Sellar <16133375+alistairsellar@users.noreply.github.com>
| "one for the reference and one for the evaluation run." | ||
| ) | ||
|
|
||
| # Reference dataset: keep existing project/exp/grid but override |
There was a problem hiding this comment.
I don't think it is keeping the existing project or exp? I think it's overwriting them?
There was a problem hiding this comment.
I have read your explanation of why project and exp are being overwritten, but this comment still says that they are not. I still think the comment needs changing.
| { | ||
| "dataset": ref_model_id, | ||
| "project": "ESMVal", | ||
| "exp": "amip", |
There was a problem hiding this comment.
Here, I think we're changing this from "historical". Is it deliberate? If so, the comment should be updated.
There was a problem hiding this comment.
Ooh, I should have picked this up in first review. Actually I think that experiment might be wrong for both runs, including the original. For this recipe (radiation budget) the choice of experiment doesn't make a difference, but it will for some recipes, so experiment should be something that the user defines as part of the model run definition. I've just opened an issue to add that: #316.
For this PR, I propose that we accept that the second run is no more wrong than the first, and that having them consistently called "amip" is as good as any choice. I.e. I propose that we keep "exp": "amip" for both.
There was a problem hiding this comment.
I still think the comment should reflect what's going on, but I will take note to pay more attention to the unchanged code in a review next time.
There was a problem hiding this comment.
I had spent some time debugging the failure last week when implementing the #287 and I dug out the logs I had kept, so here is the explanation for completion.
There was a failure because the reference dataset was treated as a CMIP6 “historical” run in the recipe, while CDDS had standardised it as a GCModelDev / ESMVal / amip run.
From the ESMValTool log for the reference dataset (dataset index 0, HadGEM3):
'dataset': 'HadGEM3-GC31-LL',
'project': 'ESMVal',
'mip': 'Amon',
'short_name': 'hfls',
'activity': 'CMIP',
'alias': 'None',
'ensemble': 'r5i1p3f3',
'exp': 'historical',
...
So, after executing update_recipe_file.py:
• ‘project’ has been changed to ESMVal
• ‘ensemble’ is r5i1p1f3 (from REF_VARIANT_LABEL)
• But:
o ‘exp’ was still historical
o ‘activity’ was still CMIP
Now looking at where ESMValTool is searching for files (in the logs):
Looked for files matching
/home/users/nikolaos.baltas/cylc-run/CMEW_287/test287c/share/work/GCModelDev/CMIP/MOHC/HadGEM3-GC31-LL/historical/r5i1p1f3/Amon/hfls/gn//hfls_Amon_HadGEM3-GC31-LL_historical_r5i1p1f3_gn.nc
/home/users/nikolaos.baltas/cylc-run/CMEW_287/test287c/share/work/GCModelDev/CMIP/NERC/HadGEM3-GC31-LL/historical/r5i1p1f3/Amon/hfls/gn//hfls_Amon_HadGEM3-GC31-LL_historical_r5i1p1f3_gn.nc
Key bits:
• Path includes GCModelDev/CMIP/.../historical/r5i1p1f3/...
• This is driven by ‘activity: CMIP’ and ‘exp: historical’.
However, the CDDS request (from create_request_file.py) uses:
"experiment_id": "amip",
and is run twice (REF and EVAL) via standardise_model_data. So CDDS is standardising:
• GCModelDev/ESMVal//amip//...
for both runs.
That means:
• CDDS has produced ‘amip’ data
• ESMValTool is still looking for ‘historical’ data for the reference dataset
• Hence: “No input files found for Dataset ... historical ...”
The evaluation dataset works fine because we had explicitly set:
'project': 'ESMVal',
'activity': 'ESMVal',
'exp': 'amip',
'ensemble': 'r1i1p1f1',
...
and ESMValTool finds (from logs):
Found input files for Dataset: hfls, Amon, ESMVal, UKESM1-0-LL, ESMVal, amip, r1i1p1f1, gn, v20251230
So, the fix was to make the reference dataset use the GCModelDev/ESMVal “amip” semantics too.
Need to also override ‘exp’ and ‘activity’ for the ‘reference dataset’ in the same way we do for the evaluation dataset.
I updated that block to:
# Reference dataset: treat as a GCModelDev / ESMVal / amip run,
ref_dataset = datasets[0]
ref_dataset.update(
{
"dataset": ref_model_id,
"project": "ESMVal",
"exp": "amip",
"activity": "ESMVal",
"ensemble": ref_variant,
"start_year": start_year,
"end_year": end_year,
}
)
# Evaluation dataset: ESMVal / amip run using MODEL_ID + VARIANT_LABEL
eval_dataset = datasets[1]
eval_dataset.update(
{
"dataset": eval_model_id,
"project": "ESMVal",
"exp": "amip",
"activity": "ESMVal",
"ensemble": eval_variant,
"start_year": start_year,
"end_year": end_year,
}
)
That aligns both datasets with:
• project: ESMVal
• activity: ESMVal
• exp: amip
which matches what CDDS is actually producing from create_request_file.py.
I hope this answers the question of why overriding 'project' and 'exp' . Now if this is the correct approach we need to investigate further.
There was a problem hiding this comment.
I am happy that this comment is resolved from my perspective, but will leave open for @mo-nikosbaltas to close if he and @alistairsellar are satisfied that the "investigate further" aspect has been / is elsewhere addressed.
|
@mo-nikosbaltas some of the review comments are just queries, but if I'm correct that we are overwriting the experiment (from "historical" to "amip", then the comment should reflect this (the recipe does run successfully with the change, but is the data that it assesses the same data?) |
NParsonsMO
left a comment
There was a problem hiding this comment.
The comments on line 80 and line 95 of CMEW/app/configure_for/bin/update_recipe_file.py do not match what is happening.
| "one for the reference and one for the evaluation run." | ||
| ) | ||
|
|
||
| # Reference dataset: keep existing project/exp/grid but override |
There was a problem hiding this comment.
I have read your explanation of why project and exp are being overwritten, but this comment still says that they are not. I still think the comment needs changing.
|
Note: I "mentioned" this issue due to a clipboard paste fail. It is not related to the other issue (#282). Sorry! |
NParsonsMO
left a comment
There was a problem hiding this comment.
I am happy that the comments are no longer confusing
| { | ||
| "dataset": ref_model_id, | ||
| "project": "ESMVal", | ||
| "exp": "amip", |
There was a problem hiding this comment.
I am happy that this comment is resolved from my perspective, but will leave open for @mo-nikosbaltas to close if he and @alistairsellar are satisfied that the "investigate further" aspect has been / is elsewhere addressed.
alistairsellar
left a comment
There was a problem hiding this comment.
Thanks @mo-nikosbaltas and @NParsonsMO, all good for me.
Yes, I think that any further investigation needed is covered by #316.
Closes #287
PR creation checklist for the developer
<issue_number>above ☝️ been replaced with the issue number?mainbeen selected as the base branch?<issue_number>_<short_description_of_feature>?good first issuelabel) been added to the PR?Climate Model Evaluation Workflow (CMEW)project been added to the PR?Definition of Done for the developer
docdirectory) related to the change been updated appropriately,including the [Quick Start] https://github.com/MetOffice/CMEW/blob/main/doc/source/user_guide/quick_start.rst) section? N/APR creation checklist for the reviewer
<issue_number>above ☝️ been replaced with the issue number?mainbeen selected as the base branch?<issue_number>_<short_description_of_feature>?good first issuelabel) been added to the PR?Climate Model Evaluation Workflow (CMEW)project been added to the PR?Definition of Done for the reviewer
docdirectory) related to the change been updated appropriately, including the Quick Start section?